In the two previous examples, you needed to map between numerical audio sample values in a computer and some real-world parameter (volts and Y coordinates). The audio samples in those examples were represented as integers, so the appropriate numerical slope and intercept values were fairly obvious.
With floating point data, there is no such obvious assignment. For example, a piece of code may expect or generate -1.0 to 1.0 data, 0.0 to 1.0 data, or -32768.0 to 32767.0 data. This ambiguity actually offers a performance advantage: rather than constantly rescaling output audio data to some standard range after each processing operation, you can often defer the rescale operation until the last minute, when the data needs to be played or displayed.
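As a minimal sketch of this deferral (the stage gains and the 16-bit target range here are illustrative, not taken from the text): two processing stages operate on nominal -1.0 to 1.0 floating-point data, and the single rescale to integer sample values happens only at output time.

```python
# Hypothetical example: process in floating point, rescale once at the end.
samples = [0.25, -0.5, 1.0]              # nominal -1.0..1.0 data
stage1 = [s * 0.5 for s in samples]      # first processing stage
stage2 = [s * 1.5 for s in stage1]       # second stage; no rescale between
# Deferred rescale: map -1.0..1.0 onto the 16-bit range only at the end.
output = [int(round(s * 32767.0)) for s in stage2]
```

The intermediate stages never pay the cost of clamping or converting; only the final conversion touches the integer range.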
In order for this to work, the slope and intercept parameters described in the previous section must accompany the audio data as it is processed. Each stage of processing "knows" what its algorithm does to the slope and intercept values, and modifies them appropriately.
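One way to picture this (a sketch of my own, not a scheme specified in the text) is to bundle the samples with their slope and intercept, and have each stage update that metadata. A gain stage, for instance, need not touch the samples at all; it can fold its factor into the slope, and a DC-offset stage can fold its shift into the intercept.

```python
# Hypothetical sketch: (samples, slope, intercept) travel together.
# The final rendered value of a sample s is slope * s + intercept.

def gain(state, g):
    samples, slope, intercept = state
    # Amplifying the rendered signal by g means scaling both the
    # slope and the intercept: g * (slope*s + intercept).
    return samples, slope * g, intercept * g

def dc_offset(state, d):
    samples, slope, intercept = state
    return samples, slope, intercept + d

def render(state):
    samples, slope, intercept = state
    return [slope * s + intercept for s in samples]
```

Each stage "knows" what its algorithm does to slope and intercept, so per-sample work is deferred until `render` is finally called.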
The issue arises when you store floating-point audio data in a file. Most current floating-point audio file formats do not allow you to store slope and intercept values along with the data. A program that loads such a file and wants to play, display, or further process it must first scan through the entire data set to guess at the full-amplitude and zero-amplitude levels, and even this is not guaranteed to recover the correct values.
This latter scenario is what occurs with floating-point support in NeXT/Sun files. Any attempt to interpret the values in such a soundfile has to be preceded by a long and painful scan step, which renders the file format nearly useless.
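That reader-side scan amounts to something like the following sketch (a hypothetical illustration; the real pass would stream blocks from disk rather than hold the whole file in memory), which must visit every sample just to guess at a full-amplitude level:

```python
# Hypothetical sketch of the scan a reader of a float soundfile with no
# amplitude metadata must perform before it can play or display anything.
def guess_full_amplitude(samples):
    peak = 0.0
    for s in samples:        # O(n) over the entire file
        a = abs(s)
        if a > peak:
            peak = a
    return peak
```

And even then the result is only a guess: the true full-scale level may be larger than any sample actually present.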
The Berkeley/IRCAM/CARL (BICSF) soundfile format includes a "maximum amplitude" tag, which must always contain the highest sample absolute value present in the file. The writer of a BICSF file is responsible for computing and storing this maximum amplitude, which in some cases means scanning all of the file's samples before closing the file.
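A writer can avoid that closing-time rescan by tracking the peak incrementally as samples are written. The class below is a hypothetical sketch of that bookkeeping (a list stands in for the file body; it is not a real BICSF implementation):

```python
# Hypothetical sketch: track the running peak during writing, so the
# "maximum amplitude" tag can be filled in at close without a second pass.
class PeakTrackingWriter:
    def __init__(self):
        self.samples = []     # stands in for the file body on disk
        self.max_amp = 0.0

    def write(self, block):
        for s in block:
            self.samples.append(s)
            a = abs(s)
            if a > self.max_amp:
                self.max_amp = a

    def close(self):
        return self.max_amp   # value for the "maximum amplitude" tag
```

The full-file scan is only unavoidable when the samples were produced by some other program that did not record the peak.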
This scheme is useful in cases where you always want to "normalize" audio samples for playback and display, meaning that you always want the "loudest" part of the audio data to play at the maximum level of which the D/A is capable, and display at the maximum amplitude on a waveform display.
But this is not always what you want. Sometimes, you want the "loudest" part of your audio file to represent some fraction of the maximum level of your D/A device or waveform display; this is often the case in mixdowns, for example, where each track must leave headroom for the others. The slope parameter described earlier does not have to equal (or even exceed) the maximum sample absolute value in the file.
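Both uses of the stored maximum amplitude can be captured in one small sketch (the function name and `target` parameter are hypothetical, not part of any file format): normalization targets full scale, while a mixdown targets some fraction of it.

```python
# Hypothetical sketch: derive a playback slope from the stored maximum
# amplitude.  target=1.0 normalizes the loudest sample to full scale;
# a smaller target leaves headroom, e.g. for mixing several tracks.
def playback_slope(max_amp, target=1.0):
    return target / max_amp
```

For instance, a file whose peak absolute sample is 0.5 normalizes with a slope of 2.0, but a mixdown aiming at quarter scale would use a slope of only 0.5.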